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Abstract 

We introduce a new additive sweeping preconditioner for the Helmholtz equation based on 
the perfect matched layer (PML). This method divides the domain of interest into thin layers 
and proposes a new transmission condition between the subdomains where the emphasis is on 
the boundary values of the intermediate waves. This approach can be viewed as an effective 
approximation of an additive decomposition of the solution operator. When combined with 
the standard GMRES solver, the iteration number is essentially independent of the frequency. 
Several numerical examples are tested to show the efficiency of this new approach. 
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1 Introduction 

Let the domain of interest be D = (0, l) d where d = 2, 3. The Helmholtz equation is 

Au(x) + ~^—u(x) = f(x), Mx e D, (1) 

c z (x) 

where u(x) is the time-independent wave field generated by the time-independent force f(x), ui 
is the angular frequency and c(x) is the velocity field. Commonly used boundary conditions are 
the approximations of the Sommerfeld radiation condition. By rescaling the system, we assume 
Cmin < c(x) < c max where c m j n and c max are of 0(1). Then cj/(2n) is the typical wave number and 
A = 2tt/uj is the typical wavelength. 

Solving the equation numerically is challenging in high frequency settings for two reasons. First, 
in most applications, the equation is discretized with at least a constant number of points per 
wavelength, which makes the number of points in each direction n = fl(w) and the total degree 
of freedom N = n d = f l(co d ) very large. Second, the system is highly indefinite and has a very 
oscillatory Green’s function, which makes most of the classical iterative methods no longer effective. 

There has been a sequence of papers on developing iterative methods for solving (|T]) . The AILU 
method by Gander and Nataf [TO] is the first to use the incomplete LU factorization to precondi¬ 
tion the equation. Engquist and Ying 00 developed a series of sweeping preconditioners based on 
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approximating the inverse of the Schur complements in the LDU factorization and obtained essen¬ 
tially w-independent iteration numbers. In ESI. Stolk proposed a domain decomposition method 
based on the PML which constructs delicate transmission conditions between the subdomains by 
considering the “pulses” generated by the intermediate waves. In ESI. Vion and Geuzaine proposed 
a double sweep preconditioner based on the Dirichlet-to-Neumann (DtN) map and several numer¬ 
ical simulations of the DtN map were compared. In im Chen and Xiang introduced a source 
transfer domain decomposition method which emphasizes on transferring the sources between the 
subdomains. In }20| . Zepeda-Nuhez and Demanet developed a novel domain decomposition method 
for the 2D case by pairing up the waves and their normal derivatives at the boundary of the subdo¬ 
mains and splitting the transmission of the waves into two directions. Most recently in E3. Liu and 
Ying proposed a recursive sweeping preconditioner for 3D Helmholtz problems. Other progresses 
includes USES COS DU and we refer to 0 by Erlangga and [5] by Ernst and Gander for a complete 
discussion. 

Inspired by m and these previous approaches, we propose a new domain decomposition method 
in this paper which shares some similarities with El ESI- The novelty of this new approach is that 
the transmission conditions are built with the boundary values of the intermediate waves directly. 
For each wave field on the subdomains, we divide it into three parts - the waves generated by 
the force to the left of the subdomain, to the right of the subdomain, and within the subdomain 
itself. This corresponds to an L + D + U decomposition of the Green’s matrix G as the sum of 
its lower triangular part, upper triangular part and diagonal part. This is why we call this new 
preconditioner the additive sweeping preconditioner. 

The rest of this paper is organized as follows. First in Section[2]we use the ID case to illustrate 
the idea of the method. Then in Section [3] we introduce the preconditioner in 2D and present the 
2D numerical results. Section 0] discusses the 3D case. Conclusions and some future directions are 
provided in Section [5] 


2 ID Illustration 

We use the pml m mug to simulate the Sommerfeld condition. The PML introduces the auxiliary 
functions 


a( x ) 


s(x) 




\ 2 

X — 7]\ 

v ) 

x e [o, 77), 


x e [ 77 , 1 - ??], 

X - 1 + T]\ 2 

V ) 

X G (I-??, 1 ], 

a ( x ) \ 1 



u; ) 


where C is an appropriate positive constant independent of u>, and rj is the PML width which is 
typically around one wavelength. 
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The Helmholtz equation with PML in ID is 

j + c%y) = yx € (0,1); 

I u(0) = 0, 

U(i) = o. 


We discretize the system with step size h = l/(n + 1), then n is the degree of freedom. With the 
standard central difference numerical scheme the discretized equation is 


S* / Sj+l/2 

h V h 


{Ui+1 


- Ui) 


Si-1/2 

h 



H 2 u i — fit 
c: 


VI < * < n, 


( 2 ) 


where the subscript i means that the corresponding function is evaluated at x = ih. 

We denote Equation ([2]) as Au = /, where u and / are the discrete array of the wave field and 
the force 


^ : — [ui ) • • •; u n ] , f : — [/l, • • • ) fn] ■ 

In ID, A is tridiagonal and Equation © can be solved without any difficulty. However, here we 
are aiming at an approach which can be generalized to higher dimensions so the rest of this section 
takes another point of view to solve p]) instead of exploiting the sparsity structure of A directly. 

With the Green’s matrix G = A~, u can be written asu = Gf. Now let us divide the discrete 
grid into to parts. We assume that ?y = 7 h and n = 2y + mb — 2 where 7 and b are some small 
constants and to is comparable to n, and we define 


X\ := {ih :l<*<7 + &— 1}, 

X p := {ih : 7 + (p - 1)6 < i < 7 + pb — 1}, p = 2,..., m — 1, 
X m := {ih : 7 + (to — 1 )b < i < 27 + mb — 2}, 


which means, Xi is the leftmost part containing the left PML of the original problem and a small 
piece of grid with b points, X m is the rightmost part containing the right PML and a grid of b 
points, and X p ,p = 2,... ,m — 1 are the middle parts each of which contains b points. u p and f p 
are defined as the restrictions of u and / on X p for p = 1,... ,m, respectively, 


Ul . - [^1 , • • • 5 ^7 + 6- 

1 T 

-lj ^ 


Up := [^7+(p—1)6> • • 

, , P = 2, . 

. , TO — 1 

Um • [^7+(m_ Vjb'i • 

„ 1 T 

• 1 r U j 2 ')-\-mb— 2 \ 5 


fi := [/1, • • •, /7+b- 

IT 

lj > 


f P ’ [fy+(p—l)bi • • 

j f~f+pb— l] 5 P = 2 , . . 

■ ,m— 1 , 

f m := • ■ 

• 5 /27+m6—2] • 



Then u = Gf can be written as 


"«i" 


'Gi,i 

g 1i2 . 

Gi, m 

u 2 

= 

g 2i1 

G 2 ,2 . 

G 2>m 

Um_ 


_Gm,l 

G m , 2 



/1 
f 2 


./ 


m 
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By introducing u VA := G p ^ q f q for 1 < p, q < m, one can write u p = J2™=i u p,q- The physical 
meaning of u p ^ q is the contribution of the force f q defined on the grid X q acting upon the grid X p . 
If we know the matrix G , the computation of u p , q can be carried out directly. However, computing 
G , or even applying G to the vector /, is computationally expensive. The additive sweeping method 
circumvent this difficulty by approximating the blocks of G sequentially and the idea works in higher 
dimensions. In what follows, we shall use to denote the approximations of u p , q . 


2.1 Approximating u p q with auxiliary PMLs 

2.1.1 Wave generated by fi 


The components u v \ for p = 1,..., m can be regarded as a sequence of right-going waves generated 
by/i- Note that the boundary condition of the system is the approximated Sommerfeld condition. If 
we assume that the reflection during the transmission of the wave is negligible, then, to approximate 
Upi, we can simply put an artificial PML on the right of the grid X-[ to solve a much smaller problem, 
since the domain of interest here is only A'i (see Figure [2(b) [). To be precise, we define 


o x (x) := < 


C / x — 77 

71 \ 71 

0 , 

C ( x — (77 + (b — 1 )h) 
V 


V 


*f(z) := 1+i 


.<{*) 


-1 


x G [0,??), 

x G [ 77 , 77 + (b- l)h], 
x G (77 + (6 - 1 )h, 2ri + (b - 1 )h\, 


We consider a subproblem on the auxiliary domain D^ := (0, + (b — 1 )h) 


v(x) = 0 , 


Vx G Df, 
\x G dD?. 


With the same discrete numerical scheme and step size h , we have the corresponding discrete system 
JTj vf i> = g on the extended grid 


:= {ih : 1 < i < 2q + b - 2}. 

Figure [l] shows a graphical view of Xf 1 , as well as other extended grids which we will see later. 

With the discrete system H^v = g, we can define an operator Gf 1 : y —y z, which is an 
approximation of Gpi, by the following: 

1. Introduce a vector g defined on X^ 1 by setting y to Xi and zero everywhere else. 

2. Solve H^v — g on X^ 1 . 

3. Set z as the restriction of v on X±. 
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Figure 1: This figure shows how the grids X p are extended with auxiliary PMLs. 


Then can be set as 


«i,i := G^f i. 

Once we have computed Uip, we can use the right boundary value of to compute u 2 ,1 by 
introducing an auxiliary PML on the right of X 2 and solving the boundary value problem with the 
left boundary value at x = (7 + b— 1 )h equal to the right boundary value of U 17 . The same process 
can be repeated to compute {t p+ i,i by exploiting the right boundary value of u p ,\ recursively for 
p = 2,... ,m — 1 (see Figure [2(c) [ ). In the following context of this section, we introduce notations 
g L , g R for a vector array g = [g 1 , - - -, g s ] T by 

9 L ~9i, 9 R ~9s , 

where g L and g R should be interpreted as the leftmost and the rightmost element of the array g. 

To formalize the definition of u p ,i for each p = 2 ,m, we introduce the auxiliary domain 
D R , which will be defined below, to simulate the right-transmission of the waves. The superscript 
R means that the auxiliary domain is intended for approximating the right-going waves. The left 
boundary of D R will be denoted as d L D R , on which the boundary value will be used to approximate 
the wave transmission as we shall see. We also extend X p with an auxiliary PML on the right to 
form an extended grid X R (see Figure [lj), which corresponds the discretization of D R . To be 
specific, we define 

Dp ■= (V + ((P - 1)& - 1)*-, 2r? + (pb - 1 )h), 
d L D R -.= {g+{{p~l)b-l)h}, 

X R := {ih : 7 + (p - 1)6 < * < 2y + pb - 2}. 

Note that the grid X R is X rn itself since X m already contains the original right PML region. The 
purpose to use the notation X R is to simplify the description of the algorithm. 


5 
















(b) Ui } i is computed by introducing an auxiliary PML on the right of X\. The dotted gray 
arrow stands for the restriction of u\ on X2 U • • • U X m , which is to be approximated. 



(d) u p ,m for p = m,..., 1 are computed sequentially. 



Xi X, x q+1 x m 

(e) u P ,q are computed for p = q first, and then for p = q + 1.... ,m and for p = q — 1 , ..., 1 sequentially. 


Figure 2: This figure shows how u Ptq are generated. The direction of the arrows indicates the 
computing orders of the approximating waves. 


For the PML on Dp, we define 



ro, 

S C (x — (r]+ (pb- 1 )h) 

[v \ V 
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x £ [t?+ ((p- 1 )b— 1 )h,r) + {pb - 1 )h], 
x G (77 + {pb — 1 )h, 2 rj + {pb - l)/i], 




We consider the following subproblem 

(<»?W^ )2 + ^)»M = o, v*eD* 

v(x) = w, Vz £ d L D p , 

w(z) = 0, Vz £ \ d L T>*, 

where w is the left boundary value of the unknown v(x). We define H p v = g as the discretization 
of this problem on X p where the right-hand side g is given by g := (— l/h 2 )[w, 0,..., 0] T as a 
result of the central discretization. The subproblem H p v = g for each p = 2,... ,m induces the 
approximation operator G p : w —> z by the following procedure: 

1. Set g = (— l/h 2 )[w, 0,..., 0] T . 

2. Solve HpV = g on X p . 

3. Set z as the restriction of v on X p . 

Then u p -\ can be defined recursively for p = 2,..., m by 

u p ,i G p u p- 1,1- 

Note that, the operator G p is not an approximation of the matrix block G Pi i, since G p maps the 
right boundary value of u p -i,i to while G p \ maps f 1 to u p p. 

2.1.2 Wave generated by f m 

The components u pm for p = 1 ,,m can be regarded as a sequence of left-going waves generated 
by fm. The method for approximating them is similar to what was done for fi (see Figure [2(d)] ). 
More specifically, for u m ,m we define 

D m := 

yM_ 


O'™ {X) ■= 


: = 

We consider the continuous problem 

| + ^)) v W = 9(x), Vz £ D 

[z(z)= 0 , VxedD™, 

and define H^v = g as its discretization on X^. The operator : y z can be defined as: 


(1 - 2 ?y — (6 - 1 )M), 

{ih : (■m — 1)6 + 1 < * < 2 y + mb — 2 } 
C f x — (1 — t] — (6 — 1 )h ) N 2 
V 


V 
0 , 
c 

V 

1 + i 


- (1 -v) 


Cm 0*0 
UJ 


x G [1 — 2rj — (b — 1 )/i, 1 — rj — (b — 1 )h), 
x £ [1 - 77 - (6 — 1 ) 6,1 — rj\, 
x £ (1 — » 7 , 1 ], 
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1. Introduce a vector g defined on by setting y to X m and zero everywhere else. 

2. Solve v = g on X%[. 

3. Set z as the restriction of v on X. m . 

Then 


For each u Pi m,P = 1,. .. ,m — 1, we introduce the auxiliary domain D p , the right boundary 
d R Dp, the extended grid X R , and the corresponding PML functions Up(x),Sp(x ) as follows 

Dp := ({p - l)bh,r] + pbh), 
d R D R := {rj + pbh}, 

X R := {xi : {p - 1)6 + 1 < i < 7 + pb - 1}, 

x £ [(p — 1 )bh, 77 + (p — l)6/i), 
a; € [77 + (p — l)6/i, 77 + p6/i], 


Vcc G Zip, 

Vcc G 

Vcc G dD R \ d R D R , 

where y is the right boundary value of v(x). Let H R v = g be its discretization on X R with 
g := (—l//i 2 )[0,...,0, w] T . We introduce the operator G p :w\Z by: 

1. Set g = (—l//i 2 )[0,..., 0, w} T . 

2. Solve H R v = g on X T p . 

3. Set z as the restriction of v on X p . 

Then can be defined recursively for p = m — 1,..., 1 by 



<r p ( x ) : = { V 


Sp(X) := (1+ 


and we consider the continuous problem 


cl , 2 . or 


{S r (x) dx r + 
v(x) = w, 

}{x) = 0 , 


2 (z) 


v(x) = 0 , 




2 . 1.3 Wave generated by f q for q = 2 ,..., m — 1 

For each q , the components u p . q for p = 1 can be regarded as a sequence of left- and 

right-going waves generated by f q (see Figure [2(e)]). For u q , q , we introduce 


Dq 1 ■= ((<? - l)bh, 2 g + ( qb - 1 )h), 

Xq 1 := {Xi : (q — 1)6 + l<i<2'y + qb— 2}, 


C ( x- (77 + (q - 1 )bh) y 


<T q (x) : = 


V 


C ( x — (77 + (qb — 1 )h) 
V 


s q (x) := 1 + i 


.<(*) 


V 

-1 


00 


x £ [(q- l)6/i, 77 + (g — 1)66), 
x € [?7 + (q - l)6/i, 77+ (qb — 1)6], 
x £ (77 + (qb — 1)6, 277 + (<76 - 1)6], 


and define H q I v = g as the discrete problem of the continuous problem 

| ( (s " w ^ )2 + = s(x) - v * e 
[x(x)=0, Mx£dD 

We introduce the operator G q 1 : y —► z as: 

1. Introduce a vector g dehned on X q J by setting y to X q and zero everywhere else. 

2. Solve Hjfv = g on X\f. 

q a q 

3. Set z as the restriction of v on I,. 

Then 


— nM f 

u q,q vr,j J q . 

Following the above discussion, the remaining components u Ptq are defined recursively as 


u p ,q : = Gp Up- l q , for p = q + 1,... ,m, 
Up.q ■■= G r p u p+l q, for p = q — 1,... ,1. 


2.2 Accumulating the boundary values 


After all the above are done, an approximation of u p is given by (see Figure 3(a)) 


U p . ^ ) 76 , 9 > TP — 1 , • • • , 777 . 

9=1 
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U 1 Un U 3 Um- 2 u m - 1 u m 

4 4 4 4 4 4 



(a) lip is a superposition of iip,q, q = 1 ,... ,m. 


Ui U 2 U] Um -2 Um— 1 

4 4 4 4 4 4 



(b) Up is a superposition of u p ,i :p _i, u Pip and u p , p +i : m 


Figure 3: This figure shows how the boundary values are accumulated after each step. The thin 
arrows indicate the transmission directions of the waves. The bold, up-pointing arrows symbolizes 
that summing up the corresponding waves on X p gives the superposition wave u p . 

In the algorithm described above, the computation of each component u Ptq requires a separate 
solution of a problem of form H p v = g or H p v = g. Since there are 0(m 2 ) such components, the 
algorithm is computationally expensive. A key observation is that the computation associated with 
each p can be combined in one single shot by accumulating the boundary values of the waves. More 
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precisely, we define 


92 

Up, qi -.q 2 := 'y ' U p ,t, 

*=9l 

which is the total contribution of the waves generated by f qi ,..., f q , 2 restricted to the grid X p . 
The quantity u Pl i :p -i, which is the total right-going wave generated by fi,... ,f p ~ 1 upon X p , can 
be computed sequentially for p = 2 ,..., m without computing each component and then adding 
them together as we described above, as long as we accumulate the boundary values after each 
intermediate step. Specifically, we first compute u q , q = G q ! f q for q = 1,..., to. This step is similar 
to what we did above. Then, to compute u Pi i :p _i we carry out the following steps 

u p ,i :p -i = G p u p _i 1 . p _ 1 , u p l . p = u p i :p _i T- u p , pl for p = 2,..., to. 

This means, before computing the total right-going wave « p +i,i ;p on subdomain X p+ i, the boundary 
values of the previous right-going waves, w p . 1 . p _ 1 and u p p , are added together, so that the the 
current right-going wave u p +i,i :p can be computed in one shot, eliminating the trouble of solving 
the subproblems for many times and adding the results together (see Figure [3(b)] ). 

For the left going waves u p , p +i-. m , a similar process gives rise to the recursive formula 

Up ,p+l\m = GpUp +1 p +1 . m , U p p . m = U p p + U p p+1 . m , for p = TO — 1, . . . , 1. 

Finally, each u p can be computed by summing it P! i :p _i, u p p and u PjP +i :m together (for the 
leftmost and the rightmost one, u± and u m , only two terms need to be summed), i.e., 

^1 = itl,l + Ui, 2 :mi 

Up — Up^i : p—i T Up^ p T fi PlP -}-l:m: V — 2, . . . , TO 1, 

Um — U m ,l:m—1 T U mi m- 

We see that, by accumulating the boundary values after each intermediate step, we only need 
to solve 0(m) subproblems instead of 0(m 2 ). 

In this algorithm, the approximation u p on each small subdomain is divided into three parts. 
From a matrix point of view, this is analogous to splitting the block matrix G into its lower triangular 
part, diagonal part and upper triangular part, and then approximating each part as an operator to 
get the intermediate waves and then summing the intermediate results together. This is why we 
call it the additive sweeping method. 

Equation ^ shows an analogy of this procedure, where the matrix G is split into 3 to — 2 blocks, 
each of which corresponds to a subproblem solving process: 

Uq q ~ Uq^q = G q q f q , <7=1,..., TO, 

P-1 

Up, 1: P —1 ~ Up, 1 ;p — 1 = y ' Gp, q f q , p = 2, . . . , TO, 

9=1 

m 

U p , p -\-\\rn ~ H p , P +l:m — ^ ^ Gp, q f ql p 1, . . . ,TO 1. 

q= P +l 
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( 3 ) 


Ui 


«1,1 + Ul,2:m 


[ Gr.r 

G'l 2 


Gi iTO 


r/ii 

U 2 


U2,l + U2,2 + W2,3:m 


G 2 ,i 

G 2j 2 

G 2j 3 

G 2 , m 


f 2 








y , m_ 


1 T 


Gm, 1 


Gm,m— 1 

Gm,m 


_f m_ 


When combined with standard iterative solvers, the approximation algorithm serves as a pre¬ 
conditioner for Equation © and it can be easily generalized to higher dimensions. In the following 
sections, we will discuss the details of the algorithm in 2D and 3D. To be structurally consistent, 
we will keep the notations for 2D and 3D the same with the ID case without causing ambiguity. 
Some of the key notations and concepts are listed below as a reminder to the reader: 


















{X p }™ =1 The sliced partition of the discrete grid. 

{D^ 1 } 1 £=i The auxiliary domains with two-sided PML padding. 

The auxiliary domains with right-side PML padding. 
The auxiliary domains with left-side PML padding. 


2 

{D L P l^i 1 


M 


{Xq }^_ 1 X q with two-sided PML padding, the discretization of D t 
{Xp}™ 2 X p with right-side PML padding, the discretization of Dp. 


*-p ip —2 

r yL\m—l 
l A p Jp= 1 


X p with left-side PML padding, the discretization of D 


{G q }^L 1 The auxiliary Green’s operators each of which maps the force on X q to the 
approximation of the wave field restricted to X q . 


• {G^}^_ 2 The auxiliary Green’s operators each of which maps the left boundary value to 
the approximated wave field restricted to X p , which simulates the right-transmission of the 
waves. 


• {Gp jp™^ 1 The auxiliary Green’s operators each of which maps the right boundary value 
to the approximated wave field restricted to X p , which simulates the left-transmission of the 
waves. 


3 Preconditioner in 2D 

3.1 Algorithm 

The domain of interest is D = (0, l) 2 . We put PML on the two opposite sides of the boundary, 
X 2 = 0 and a; 2 = 1, to illustrate the idea. The resulting equation is 

I (^df + (s(x 2 )d 2 ) 2 + u(x) = f(x), Vx = (x!,x 2 ) £ D, 

I u(x) = 0, Mx £ dD, 
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We discretize D with step size 6 = 1 j(n + 1) in each direction, which results the Cartesian grid 

X := {(iih,i 2 h) : 1 < ii,i 2 < n}, 


and the discrete equation 


s h ( 

'S 72 + 1/2 /n 

h\ 

7 (1*71,72 + 1 **71 

'71+1,72 

21 X 21,^2 ^1 — 1,12 

6 2 


s i 2 —1/2 


(l*7i ,12 l*7i -i’2 — 1 


2 1 * 71,72 — fil ,'42 ■ * 1;*2 ^ n , 

C il,*2 


( 4 ) 


where the subscript (+i,'<2) means that the corresponding function is evaluated at (iih : i 2 h), and 
since s(x 2) is a function of x 2 only, we omit the i\ subscript, u and / are defined to be the 
column-major ordering of the discrete array u and / on the grid X 

U •— [l*l,l; • • • ; 1; • • ■ t l*7i,n] ; X • — 1,1; • * • ; fn,l ; * • * ; fn,n\ 

Now Q can be written as Au = /. 

We divide the grid into m parts along the x 2 direction 


X\ := {(iih, i 2 h) : 1 < i\ < n, 1 < i 2 < 7 + 6 - 1}, 

X p := {(iih, i 2 h) : 1 < i\ < n, 7 + (p — 1)6 < i 2 < 7 + p6 — 1}, p = 2,..., m — 1, 

X m := {(ii/i, i 2 h) : 1 < *i < n, 7 + (m — 1)6 < ?2 <27 + mb — 2}, 


and we define u p and / p as the column-major ordering restriction of u and / on X p 

«1 * [l*l,l; • • ■ ; 1 * 77 , 1 ; • • * ; 1 * 71 , 7 + 6 — l] ; 

Up != , l*n,7+(p— 1)6; • • • ; l*n,7+p6— l] ; P = 2 , . . . , 771 1 , 

l*m. ■ [**l, 7 +( 7 n— 1 ) 6 ; * • * ; 1 * 77 , 7 +(tt 7 — 1 ) 6 ; ■ ■ • ; **n,27+7776—2] ; 

/1 := [/l,l; • • • ) /ti, 1; • • • ; /ti, 7+6—l] ; 

/p ' = [/l,7+(p— 1 ) 6 ; ■ • ■ ; /ti,7+(p— 1 ) 6 ; ■ • • ; fn,j+pb— l] ; P = 2 , . . . , TO 1 , 
X771 ■ [/l,7+(m—1)6; • • ■ ; fn,'y+(m— 1)6; • ■ ■ ; fn, 27+7716— 2] , 

then w = G/ for G = A -1 can be written as 


"«i" 


'G 1;1 

G+2 • 

Gi, m 

1*2 

= 

G2+ 

G2,2 

G 2t m 

Um_ 


G m ,l 

G m ,2 

• Gm,m_ 


7i 

/2 

y m 


Auxiliary domains. Following to the ID case, the extended subdomains and the corresponding 
left and right boundaries are defined by 

Dq 1 = (0; 1) x {(q - l)6/i, 27) + (qb - 1 )6), q=l,...,m, 

D p = (0, 1) x (?) + ((p - 1)6 - 1)6, 2?7 + (pb - 1)6), p = 2,...,m, 

Dp = (0,1) x ({p- 1)66,7) + p66), p= l,...,m- 1, 

<9 L -D^ = (0,1) x {?) + ((p - 1)6 - 1)6}, p = 2 ,..., to, 

= ( 0 , 1 ) x {?) +p 66 }, p = 1 ,..., to — 1 . 
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The extended grid for these domains are 

X™ := {(iih, i 2 h) : 1 < i\ < n, (q - 1)6 + 1 < i 2 < 2y + qb - 1}, q = l,...,m, 

Xp := {(iih, i 2 h) : 1 < *i < n,7 + (p — 1)6 < i 2 < 2y + pb - 2}, p = 2,... ,m, 

Xp := {(iih, i 2 h) : 1 < *i < n, (p — 1)6 + 1 < i 2 < 7 + pb — 1}, p = l,-».., m — 1. 

Auxiliary problems. For q = 1 ,m, we define H^v = g to be the discretization on X^ of 

the problem 

f (dl + («f (X 2 )d 2 f + v ( x ) = 9 {x), Va; G D ™, 

|v(a;)=0, VxedD™. 

For p = 2,... ,m, H R v = g is the discretization on X R of the problem 

^1 + (Sp (x 2 )d 2 ) 2 + v(x) =0, Va; G D R , 

v(x ) = to(ari), Va; G d L D R , 

v(x ) =0, Va; G dD R \ d L D R , 

where p := (—l/h 2 )[iy T , 0,..., 0] T and w := [u>i,..., w n ] T is the discrete value of w(xi). Finally, 
for p = 1 ,,m— 1, HpV = g is the discretization on X R of the problem 

(&1 + (Sp ( x 2 )d 2 ) 2 + v(x) =0, Va; G Dp , 

v(x) = w(x 1), Va 'Gd R Dp, 

v(x) = 0, Va; G dD p \ d R D p , 

where g := (— l/h 2 )[0 ,... , 0 ,w T ] T and w := ..., w n ] T . 

Auxiliary Green’s operators. For q = 1 we define G q r : y 1—► 2 to be the operator 

defined by the following operations: 

1. Introduce a vector g defined on X^ by setting y to X q and zero everywhere else. 

2. Solve Hffv = g on X%*. 

3. Set z as the restriction of v on X q . 

For p = 2,..., m, the operators G R : zt? 1—>- is given by: 

1. Set g = (—l/Ji 2 )[w T , 0,..., 0] T . 

2. Solve H R v = g on X R . 

3. Set z as the restriction of v on X p . 
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Finally, for p = 1,..., m — 1, G R : w z is defined as: 

1. Set g = (—l//i 2 )[0,. .., 0,w T ] T . 

2. Solve HpV = g on X R . 

3. Set z as the restriction of v on X p . 


Putting together. Similar to the previous section, we introduce the left boundary value g L and 
the right boundary value g R for a column-major ordering array g = \gip, ■ ■ ., <? Sl ,i, • • ■, g Sl , S2 ] T 
induced from some grid with size si x s 2 by 



9 L ■= [ffi.i, • 

1 T 

9 R 

— [ 51,82 i • • • 

1T 

i9si,s 2 \ ’ 


approximations for u p ,p = 

1,...,TO, 

can be 

defined step by step as 



:= Gf f q , q = 1 

,...,TO, 





'Up,l:p— 1 

— G R fi R 

■— u p a p-l,l:p-h 

u R - 

U 'p,l:p ’ 

-u R 

Ul p,l:p 

— 1 

for p = 2,.. 

,m 


— G L ii L 

u p+l,p+l:mi 

u L 

p,p:m 

-~U L 

u p,p 

+ u L 

' u, p,p-\-l:m’> 

for p = m 

-1, 

ILi 

•— ^1,1 ^1,2:777.5 






Up 

• = ^p,l:p— 1 p,p 

^p,p+l:77D P 

= 2 

-1, 


Um 

•— ^m,l:m— 1 H - 

,m • 






To solve the subproblems = g , H R v = g and H R v = g , we notice that they are indeed 

quasi-ID problems since 7 and b are some small constants. Therefore, for each one of them, we 
can reorder the system by grouping the elements along dimension 2 first and then dimension 1, 
which results a banded linear system that can be solved by the LU factorization efficiently. These 
factorization processes induce the factorizations for the operators G^' ! , G R and G R symbolically, 
which leads to our setup algorithm of the preconditioner in 2D as described in Algorithm [l] and the 
application algorithm as described in Algorithm [2] 


Algorithm 1 Construction of the 2D additive sweeping preconditioner of the Equation Q. Com¬ 
plexity = 0(n 2 (b + 7 f/b) = 0(N(b + 7 ) 3 /b). 

for q = 1 ,..., to do 

Construct the LU factorization of H^ 1 , which defines G^ 1 . 

end for 

for p = 2 ,..., to do 

Construct the LU factorization of H R , which defines G R . 

end for 

for p = 1 ,..., to — 1 do 

Construct the LU factorization of Hp, which defines Gp. 

end for 


To analyze the complexity, we note that, in the setup process, there are 0{n/b) subproblems, 
each of which is a quasi-ID problem with 0 {7 + b) layers along the second dimension. Therefore, 
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Algorithm 2 Computation of u ss Gf using the preconditioner from Algorithm [l] Complexity 

= Q(n 2 (b + 7 ) 2 /b) = Q(N(b + 7 Y/b). _ 

for g=l,..,,mdo 

u q , q = Gff q 

end for 

for p = 2,...,rado 
Mp,l:p-1 = GpU p _ 11:p _ 1 

u p 7 l:p u p,l:p— 1 ' 

end for 

for p = m — 1,..., 1 do 

Mp,p+l:m = G, p U p+ i p+1 . m 

= 7/^ -I- II^ 

p,p:m P,P p,p+l:m 

end for 

Ml = Ml.l + Ml,2:m 
for p = 2 ,..., to — 1 do 

Mp — Mppip— i -f- Mp^p -f Up p^i 

end for 

Mm M m? X:m—1 “f M rn.m 


the setup cost of each subproblem by the LU factorization is 0(n(y + 6 ) 3 ) and the application 
cost is 0(n(y + 6 ) 2 ). So the total setup cost is 0(n 2 (7 + b) 3 /b). Besides, one needs to solve each 
subproblem once during the application process so the total application cost is 0 (n 2 (7 + b) 2 /b). 

There are some differences when implementing the method practically: 

1. In the above setting, PMLs are put only on two opposite sides of the unit square for illustration 
purpose. In reality, PMLs can be put on other sides of the domain if needed. As long as there 
are two opposite sides with PML boundary condition, the method can be implemented. 

2. The thickness of the auxiliary PMLs introduced in the interior part of the domain needs not 
to be the same with the thickness of the PML at the boundary. In fact, the thickness of the 
auxiliary PML is typically thinner in order to improve efficiency. 

3. The widths of the subdomains are completely arbitrary and they need not to be the same. 
Practically, the widths can be chosen to be larger for subdomains where the velocity field 
varies heavily. 

4. The symmetric version of the equation can be adopted to save memory and computational 
cost. 

3.2 Numerical results 

Here, we present some numerical results in 2D to illustrate the efficiency of the algorithm. The 
proposed method is implemented in MATLAB and the tests are performed on a 2.0 GHz computer 
with 256 GB memory. GMRES is used as the iterative solver with relative residual equal to 10 ~ 3 
and restart value equal to 40. PMLs are put on all sides of the unit square. The velocity fields 
tested are given in Figure [4j 


16 





(a) A converging lens with a Gaussian profile at the center of the domain. 

(b) A vertical waveguide with a Gaussian cross-section. 

(c) A random velocity field. 




(a) (b) (c) 

Figure 4: The three velocity fields tested in 2D. 

For each velocity field, two external forces are tested: 

(a) A Gaussian point source centered at (1/2,1/8). 

(b) A Gaussian wave packet with wavelength comparable to the typical wavelength of the domain. 
The packet centers at (1/8,1/8) and points to the direction (l/-\/2, l/\/2). 

In these tests, each typical wavelength is discretized with 8 points. The width of the PML at 
the boundary and the one of the PMLs introduced in the interior parts of the domain are both 9 h, 
i.e., 7 = 9. The number of layers in each interior subdomain is b = 8, the number of layers in the 
leftmost subdomain is b + 7 — 1 = 16 and the one in the rightmost is b + 7 — 2 = 15. 

We vary the typical wave number lo/(2tt) and test the behavior of the algorithm. The test 
results are presented in Tables 00 and [3] T setU p is the setup time of the algorithm in seconds. 
T so ive is the total solve time in seconds and N [ ter is the iteration number. From these tests we see 
that the setup time scales like O(N) as well as the solve time per iteration, which is consistent 
with the algorithm complexity analysis. The iteration number remains constant or grows at most 
logarithmically, which shows the efficiency of the preconditioner. 
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force (a) force (b) 


velocity field (a) 

force (a) 

force (b) 


N 

^setup 

Ater 

Xsolve 

Ater 

^solve 

16 

127 2 

8.1669e—01 

4 

5.3199e—01 

4 

2.5647e—01 

32 

255 2 

3.4570e+00 

4 

7.3428e—01 

4 

7.2807e—01 

64 

511 2 

1.5150e+01 

5 

3.6698e+00 

4 

3.7239e+00 

128 

1023 2 

6.2713e+01 

5 

1.6812e+01 

4 

1.6430e+01 

256 

2047 2 

2.6504e+02 

6 

7.8148e+01 

4 

5.6936e+01 


Table 1: Results for velocity field (a) in 2D. Solutions with w/(2i r) = 32 are presented. 


4 Preconditioner in 3D 

4.1 Algorithm 

In this section we briefly state the preconditioner in 3D case. The domain of interest is D = (0, l) 3 . 
PMLs are put on two opposite faces of the unit cube, x 3 = 0 and *3 = 1, which results the equation 

j (d\ +d% + (s{x 3 )d 3 ) 2 + u ( x ) = f( x )> V x = (x 1 ,x 2 ,x 3 ) G D, 

[u(a:) = 0, \/x G dD , 

Discretizing D with step size h = l/(n + 1) gives the grid 


X := {(iih,i 2 h,i 3 h) : 1 < h,i 2 ,i 3 < n}, 


and the discrete equation 


(S i3 + 1 / 2 ^ ^ ^ Sia-l/2,A 

\ h + 1 “*li* 2,*3/ h V u il,i2,i3 if J 


-(-1,12 ,^3 2^1^2,13 T Ui 1 — 1,22,23 ^ ^H,*2 + l,?3 ^^q,«2,*3 


/l 2 

w 2 


h 2 

*li*2i*3 = /*1,*2,*3> — *1)*2 A 1 


( 5 ) 
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force (a) force (b) 


velocity field (b) 

force (a) 

force (b) 

oj/(2tt) 

N 

^setup 

X+er 

Xsolve 

Xiter 

^solve 

16 

127 2 

7.0834e—01 

6 

2.9189e—01 

4 

1.9408e—01 

32 

255 2 

3.2047e+00 

8 

1.6147e+00 

4 

7.9303e—01 

64 

511 2 

1.4079e+01 

8 

6.3057e+00 

4 

3.9008e+00 

128 

1023 2 

6.0951e+01 

8 

2.9097e+01 

4 

1.5287e+01 

256 

2047 2 

2.6025e+02 

8 

1.1105e+02 

5 

7.2544e+01 


Table 2: Results for velocity field (b) in 2D. Solutions with u/(2i r) = 32 are presented. 


u and / are defined as the column-major ordering of u and / on the grid X 

W •— [ui,l,l j * * * ; H77, 1,1 ) • • • ) ^n,n,l j • • • 5 ^77,77,77] 5 \f 1 , 1 , 1 ) • * ■ 7 fn, 1 , 1 ) • * • 5 fn,n,l i • • ■ 5 /n,n,n] • 

X is divided into to parts along the £3 direction 
Xi := {(ii/i, *2/1,*3/1) : 1 < ii < n, 1 < *2 < n, 1 < *3 < 7 + 6 — 1}, 

X p := 12k, i 3 h) : 1 < ii < n, 1 < *2 < n, 7 + (p — 1)6 < i 3 < 7 + p6 — 1 }, p = 2 ,... , to — 1 , 

X m := {(ii/i, 126, i 3 h) : 1 < *i < n, 1 < *2 < n, 7 + (m — 1)6 < «3 < 27 + mb — 2 }. 

Up and /p are the column-major ordering restrictions of u and / on X p 

Ui .— [u + 1,1, ■ • ■ , U n ,1.1. • • • ) U n , n ^ 1, • . . , 1177,77,7+6—1] , 

Mp • [^l,l,'y+(p—1)6) • • • ) ^n,l,7+(p—1)6) • • * ) ^n,n,7+(p—1)6) • • • ) Ht7,77,7+p 6— l] ) P 2, . . . , TO 1, 

- = [^l,l,7+(m—1)6) • • • ) ^n,l,7+(m—1)6) • • * } Ht7,77,7+(777—1)6) * • • ) 1177,77,27+7776—2] ) 

^1 * = [/l,l,l) • • ■ ) fn, 1,1) ■ ■ • ) fn,n,l> ■ ■ • ) /n,77,7+6— l] ) 

.fp * [/l,l,7+(p—1)6) * • * ) fn , l ,~ f +( p — 1)6) • • • ) f 77,77,7+(p—1)6) • ■ • ) f 77,77,7+p6—l] ) P 2, . . . , TO 1, 

.frri * = [/l,l,7+(777—1)6) • • • ) fn , l ,7+(777 —1)6) ■ • • ) ^77,77,7+ (777 —1)6) • ■ • ) fn , n , 27+7776—2] • 
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force (a) force (b) 


velocity field (c) 

force (a) 

force (b) 

oj/(2tt) 

N 

^setup 

Alter 

Xsolve 

Alter 

^solve 

16 

127 2 

7.0495e—01 

5 

2.4058e—01 

6 

2.8347e—01 

32 

255 2 

3.1760e+00 

5 

1.0506e+00 

5 

9.9551e—01 

64 

511 2 

1.4041e+01 

6 

4.7083e+00 

7 

6.7852e+00 

128 

1023 2 

6.1217e+01 

6 

1.8652e+01 

6 

1.9792e+01 

256 

2047 2 

2.5762e+02 

8 

1.1214e+02 

6 

8.6936e+01 


Table 3: Results for velocity field (c) in 2D. Solutions with w/(27r) = 32 are presented. 


Auxiliary domains. The extended subdomains, the extended grids, and the corresponding left 
and right boundaries are defined by 


D m 

q 

= (o,i) 

x (0,1) x 

((q - 1 )bh, 2r) + 

(qb- 

- l)h), 

q 

= 1 , • • 

. ,?n, 



d r 

V 

= (o,i) 

x (0,1) x 

{v + ((p - 

-1)5- 

1 )h, 

277 + (pb - 

1 )h), 

P = 

2, . . . ,m 


D l 

p 

= (0,1) 

x (0,1) x 

{{p-l)bh,rj+pbh), 

P = 1 

5 • • 

. , m — 

1 , 



d L D R 

= (0,1) 

x (0,1) x 

{v + ((p- 

- 1 ) 6 - 

1 )h}, p = 

2 , 

• •, m, 




d R D R 

= (0,1) 

x (0,1) x 

{g + pbh}, p = 

= 1,. 

., m — 

1, 





V M 

X q 

= {(*1 h 

i 2 h,i 3 h) 

1 < *i < 

n, 1 < 

*2 < 

n, (q- 

1)6 

+ 1 < 

i 3 < 

2 y + qb - 

- 1 }, 

v R 

Ap 

= {(iih 

i 2 h,i 3 h) 

1 < ii < 

n. 1 < 

i 2 < 

n, 7 + 

[P~ 

1)6 < 

h < 

2 y + pb - 

-2}, 

x p 

= {(*i h 

i 2 h,i 3 h) 

1 < ii < 

n, 1 < 

i 2 < 

n,{p- 

1)6 

+ 1 < 

i 3 < 

7 +pb - 

1 }, 


Auxiliary problems. For each q = 1 ,,m, H^v = g is defined as the discretization on X of 

| (&l + dl + (sf ( x 3 )d 3 ) 2 + v(x) = g(x), Vx G Df, 

|_u(a;) =0, \/x G dDq 1 , 
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For p = 2,..., to, HpV = g is defined as the discretization on X R of 

(df + d\ + (s R (x 3 )d 3 ) 2 + v{x) = 0, Vx £ D R , 

v(x) = w(xi, x 2 ), \/x £ d L D R , 

v{x)=0, \/x £ dD R \d L D R , 

where g := (— l/h 2 )[w T , 0,..., 0] T and w := ... ,w n p,... ,w n<v \ is the discrete boundary 

value. Finally, for p = 1,... ,rn — 1, H R v = g is the discretization on X R of 

+ d 2 + (sp(x 3 )d 3 ) 2 + v(x) =0, \/x £ D^, 

v(x) = w(xi,x 2 ), Vx £ d R Dp, 

v{x) = 0, \/x £ dD£ \ d R D%, 

where g := (-l/ft 2 )[0,... ,0,u; T ] T and w := [w u ,... ,w n>1 ,.. 

Auxiliary Green’s operators. For q = 1 G^ r : y i —> z is defined using the following 

operations: 

1. Introduce a vector g defined on X^ by setting y to X q and zero everywhere else. 

2. Solve H^v = g on X^ 1 . 

3. Set z as the restriction of v on X q . 

For p = 2,..., to, G R : w >->• z is given by: 

1. Set g = (— l/h 2 )[w T , 0,..., 0] T . 

2. Solve H R v = g on X R . 

3. Set z as the restriction of v on X p . 

Finally, for p = 1,..., m — 1, the operators G R : w z is introduced to be: 

1. Set g = (—l//i 2 )[0,... ,0,w T ] T . 

2. Solve HpV = g on X J p . 

3. Set z as the restriction of v on X p . 

Putting together. In the 3D case, g L and g R for the column-major ordering array 
g = [5i, 1 , 1 , • • ■ , 081 , 1 , 1 , • • • , 5 * 1 , 82 , 1 , • • • ,5*1,* 2 ,s 3 ] T induced from some 3D grid with size 
si x s 2 x S 3 are given by 

5 • [ 51 , 1 , 1 , ■ • ■, 5*i, 1 , 1 , • • •,5si,*2,i] , 5 • [.9i,i,S 3 , • • •, 5 * 1 , 1 ,* 3 , • • •, 5si,* 2 ,**] • 
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Algorithm 3 Construction of the 3 D additive sweeping preconditioner of the system ©■ Com¬ 
plexity = 0(n 4 (6 + 7 f/b) = 0(N 4 / 3 (b + 7 ) 3 /b). 

for q = 1,..., to do 

Construct the nested dissection factorization of H 41 , which defines Gg 1 ■ 

end for 

for p = 2,..., m do 

Construct the the nested dissection factorization of H p , which defines G p . 

end for 

for p = 1,..., m — 1 do 

Construct the the nested dissection factorization of Hp, which defines G p . 

end for 


Algorithm 4 Computation of u ss Gf using the preconditioner from Algorithm [3] Complexity 
= 0(n 3 log n(b + 7 ) 2 /b) = 0(N\ogN(b + 7 ) 2 / 6 ). 


for q = 1, ... ,to do 

ii — f ' M f 
U'q.q ^q J q 

end for 

for p = 2,..,,m do 

Mp,l:p-1 = GpU p _ ll .p _ 1 
u R = u^ 4 - ii R 

u p,l:p u p,l:p—l ' U p,p 

end for 


for p = to — 1 ,..., 1 do 

Wp,p+l:m = GpUp +1 p +1 . m 

I/ 4 ' : 71 ^ 

p,p:m PiP p,p-\-l:m 

end for 


Ml — Ml,l + Ul,2:m 

for p = 2,..., to — 1 do 

Up — Mp t i : p_i -f- Up : p 

end for 

Ura — l:m—1 


The subproblems H^v = g, H p v = g and H p v = g are quasi-2D. To solve them, we group the 
elements along dimension 3 first, and then apply the nested dissection method [Tlj !5] to them, as 
in [7]. This gives the setup process of the 3D preconditioner in Algorithm [ 3 ] and the application 
process in Algorithm |4j 

For the algorithm analysis, we notice that each quasi-2D subproblem has 0 {7 + b) layers along 
the third dimension. Therefore, the setup cost for each subproblem is 0((y + b) 3 n 3 ) and the 
application cost is 0((7 + b) 2 n 2 logn). Taking the total number of subproblems into account, the 
total setup cost for the 3D preconditioner is 0(n 4 (6 + 7 ) 3 /b) and the total application cost is 
0(n 3 log n(b + 7 ) 2 /b). 
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4.2 Numerical results 


Here we present the numerical results in 3D. All the settings and notations are kept the same 
with Section |3T2| unless otherwise stated. The PMLs are put on all sides of the boundary and the 
symmetric version of the equation is adopted to save memory cost. The PML width is rj = 9 h 
for the boundary and is ? 7 aux = 5 h for the interior auxiliary ones. The number of layers in each 
subdomain is b = 4 for the interior ones, b + 7 — 1 = 12 for the leftmost one and 6 + 7 — 2 = 11 for 
the rightmost one. 

The velocity fields tested are (see Figure [5]): 

(a) A converging lens with a Gaussian profile at the center of the domain. 

(b) A vertical waveguide with a Gaussian cross-section. 

(c) A random velocity field. 



(a) (b) (c) 

Figure 5: The three velocity fields tested in 3D. 

The forces tested for each velocity field are: 

(a) A Gaussian point source centered at (1/2,1/2,1/4). 

(b) A Gaussian wave packet with wavelength comparable to the typical wavelength of the domain. 
The packet centers at (1/2,1/4,1/4) and points to the direction (0, l/v^2, l/\/2). 

The results are given in Tables m and [ 6 ] From these tests we see that the iteration number 
grows mildly as the problem size grows. We also notice that the setup cost scales even better than 
0(N 4 / 3 ), mainly because MATLAB performs dense linear algebra operations in a parallel way, 
which gives some extra advantages to the nested dissection algorithm as the problem size grows. 

5 Conclusion 

In this paper, we proposed a new additive sweeping preconditioner for the Helmholtz equation 
based on the PML. When combined with the standard GMRES solver, the iteration number grows 
mildly as the problem size grows. The novelty of this approach is that the unknowns are split 
in an additive way and the boundary values of the intermediate results are utilized directly. The 
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force (a) 


force (b) 


velocity field (a) 

force (a) 

force (b) 


N 

T 

setup 

^iter 

-^solve 

-^iter 

-^solve 

5 

39 :i 

2.3304e+01 

3 

2.9307e+00 

4 

3.7770e+00 

10 

79 3 

3.2935e+02 

3 

3.6898e+01 

4 

4.6176e+01 

20 

159 2 

4.2280e+03 

4 

4.3999e+02 

4 

4.6941e+02 


Table 4: Results for velocity field (a) in 3D. Solutions with w/(27t) = 10 at X\ = 0.5 are presented. 



force (a) force (b) 


velocity field (b) 

force (a) 

force (b) 

w/(27t) 

N 

^setup 

Mter 

^solve 

-^iter 

^solve 

5 

39 3 

2.1315e+01 

3 

2.7740e+00 

3 

2.7718e+00 

10 

79 3 

3.4256e+02 

4 

4.4286e+01 

3 

3.4500e+01 

20 

159 2 

4.3167e+03 

5 

5.7845e+02 

4 

4.6462e+02 


Table 5: Results for velocity field (b) in 3D. Solutions with oj/(2tt) = 10 at X\ = 0.5 are presented. 


disadvantage is that, for each subdomains, three subproblems need to be built up, which is time 
consuming compared to l 7] and m ■ However, the costly parts of the algorithm, i.e. the whole 
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force (a) force (b) 


velocity field (c) 

force (a) 

force (b) 


N 

T 

setup 

Alter 

-^solve 

Alter 

-^solve 

5 

39 :i 

2.1063e+01 

4 

3.8074e+00 

4 

3.7975e+00 

10 

79 3 

3.4735e+02 

4 

4.4550e+01 

4 

4.5039e+01 

20 

159 2 

4.3391e+03 

4 

4.4361e+02 

5 

5.8090e+02 


Table 6: Results for velocity field (c) in 3D. Solutions with w/( 2n) = 10 at x\ = 0.5 are presented. 

setup process and the solve processes of the subproblems H^v = g , can be done in parallel. The 
only parts that must be implemented sequentially are the accumulations of the left-going and right¬ 
going waves, where only the solve processes of the subproblems H^v = g and H^v = g are involved, 
which are the cheapest parts of the algorithm. Besides, we think that the whole approximation 
process is simple and structurally clear from a physics point of view and the idea might be easy to 
be generalized to other equations. 

There are also some other directions to make potential improvements. First, other numerical 
schemes of the equation and other approximations of the Sommerfeld radiation condition can be 
used to develop more efficient versions of this additive preconditioner. Second, the parallel version 
of the nested dissection algorithm can be combined to solve large scale problems. Last, in the 3D 
case, the quasi-2D subproblems can be solved recursively by sweeping along the X 2 direction with 
the same technique, which reduces the theoretical setup cost to O(A) and the application cost to 
O(N). However, compared to |7], the coefficient of the complexity in this new method is larger, so 
it is not clear whether or not the recursive approach will be more efficient practically. Nevertheless, 
it is of great theoretical interest to look into it. 
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